home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Cream of the Crop 11
/
Cream of the Crop 11-1.iso
/
windows
/
boyer04.zip
/
BOYER.DOC
< prev
next >
Wrap
Text File
|
1996-01-14
|
11KB
|
256 lines
-------------------------------------------------------------------------------
Fast String Search Algorithm (Boyer) for Windows C/C++ Programmers
version 0.4
(Supports Windows 3.x)
by Patrick KO Shu-pui
Copyright (c) 1991-1996 All Rights Reserved.
-------------------------------------------------------------------------------
===============================================================================
0. INTRODUCTION
===============================================================================
This document only provides you the following information:
0.1 Description of the Boyer-Moore Function Calls
0.2 How to use Boyer-Moore functions in your program
0.3 Questions and Answers
0.4 References to the Boyer-Moore Algorithm
===============================================================================
1. Boyer-Moore Function Calls
===============================================================================
Reference of Search Model: (For Clarity)
Problem solved by Boyer-Moore:
Let p, s be strings, and length of p = m, length of s = n.
For a pattern p, find the position i in search space s where
p[0] = s[i], p[1] = s[i+1], ... p[m-1] = s[i+m-1].
Time Complexity of Boyer-Moore Algorithm:
On "average" case, it takes n/m steps to find p in s.
-------------------------------------------------------------------------------
Function : SetFindPattern()
Synopsis : initialize the pattern to be matched, this function must be
called prior to Find() and FindBackward().
Parameter: LPSTR p = pattern string to be matched
Return Value: HFIND = a find handle
Example:
HFIND hfind;
LPSTR pattern = "find me";
hfind = SetFindPattern( pattern );
...
See Also: FreeFindPattern()
-------------------------------------------------------------------------------
Function : Find()
Synopsis : Find a pattern inside a string from s[0] to s[n-1]
Parameter: HFIND hfind = the find handle obtained from SetFindPattern()
LPSTR s = pointer to the start of string to be searched
LONG slen = length of the string s (i.e. n)
Return Value: NULL = pattern not found in string s
non NULL = pointer within s where pattern is matched
at first position (i.e. p[0])
Example: To find a pattern "tiger" in a text in RAM starting at
pointer "txtp" with a length of 1,000,000 characters,
program like this:
HFIND hfind;
LPSTR matchp;
LPSTR txtp;
/* allocate memory for txtp */
...
if ((hfind = SetFindPattern( "tiger" )) != (HFIND)NULL)
{
matchp = Find( hfind, txtp, 1000000L );
if (matchp != NULL)
/* found */
else
/* not found */
...
}
... use hfind many times until finish with it then free ...
FreeFindPattern(hfind);
See Also: SetFindPattern(), FindIC(), FindBackward(), FindBackwardIC()
-------------------------------------------------------------------------------
Function : FindBackward()
Synopsis : Find a pattern inside a string from s[n-1] to s[0]
Parameter: HFIND hfind = the find handle obtained from SetFindPattern()
LPSTR s = pointer to the start of string to be searched
LONG slen = length of the string s (i.e. n)
Return Value: NULL = pattern not found in string s
non NULL = pointer within s where pattern is matched
at first position (i.e. p[0])
Example: To find a pattern "tiger" in a text in RAM starting at
end of pointer "txtp" with a length of 1,000,000
characters, program like this:
HFIND hfind;
LPSTR matchp;
LPSTR txtp;
/* allocate memory for txtp */
...
if ((hfind = SetFindPattern( "tiger" )) != (HFIND)NULL)
{
matchp =
FindBackward( hfind, txtp + 1000000L - 1, 1000000L);
if (matchp != NULL)
/* found */
else
/* not found */
}
... use hfind many times until finish with it then free ...
FreeFindPattern(hfind);
See Also: SetFindPattern(), Find(), FindIC(), FindBackwardIC()
-------------------------------------------------------------------------------
Function : FindIC()
Synopsis : Find a pattern inside a string from s[0] to s[n-1]
and ignore case
Parameter: HFIND hfind = the find handle obtained from SetFindPattern()
LPSTR s = pointer to the start of string to be searched
LONG slen = length of the string s (i.e. n)
Return Value: NULL = pattern not found in string s
non NULL = pointer within s where pattern is matched
at first position (i.e. p[0])
Example: To find a pattern "tiger" in a text in RAM starting at
pointer "txtp" with a length of 1,000,000 characters,
program like this:
HFIND hfind;
LPSTR matchp;
LPSTR txtp;
/* allocate memory for txtp */
...
if ((hfind = SetFindPattern( "tiger" )) != (HFIND)NULL)
{
matchp = FindIC( hfind, txtp, 1000000L );
if (matchp != NULL)
/* found */
else
/* not found */
...
}
... use hfind many times until finish with it then free ...
FreeFindPattern(hfind);
See Also: SetFindPattern(), Find(), FindBackward(), FindBackwardIC()
-------------------------------------------------------------------------------
Function : FindBackwardIC()
Synopsis : Find a pattern inside a string from s[n-1] to s[0]
and ignore case
Parameter: HFIND hfind = the find handle obtained from SetFindPattern()
LPSTR s = pointer to the start of string to be searched
LONG slen = length of the string s (i.e. n)
Return Value: NULL = pattern not found in string s
non NULL = pointer within s where pattern is matched
at first position (i.e. p[0])
Example: To find a pattern "tiger" in a text in RAM starting at
end of pointer "txtp" with a length of 1,000,000
characters, program like this:
HFIND hfind;
LPSTR matchp;
LPSTR txtp;
/* allocate memory for txtp */
...
if ((hfind = SetFindPattern( "tiger" )) != (HFIND)NULL)
{
matchp =
FindBackwardIC( hfind, txtp + 1000000L - 1, 1000000L);
if (matchp != NULL)
/* found */
else
/* not found */
}
... use hfind many times until finish with it then free ...
FreeFindPattern(hfind);
See Also: SetFindPattern(), Find(), FindIC(), FindBackward()
-------------------------------------------------------------------------------
===============================================================================
2. Questions and Answers
===============================================================================
Q: How can I use Boyer-Moore functions in my Windows program?
A: Compile boyer.c into object file and link with your program.
Q: How can I use Boyer-Moore functions in other Windows
frontend toolkits such as SQLWindows(tm), PowerBuilder(tm),
Visual Basic(tm), etc.?
A: Boyer-Moore functions are also provided in form of DLL,
which can be called from these toolkits.
Q: Can I use Find() and FindBackward() with a GlobalLock()
pointer in Windows?
A: Yes.
Q: Must I delcare my pointer as HPSTR (huge pointer) ?
A: Not necessary. Find() and FindBackward() will convert your
LPSTR as HPSTR. However, in your own code you must aware
that you are holding a LPSTR and take care of the pointer
arithmetic and conversion. (see demo.c for example)
Q: What is the limit of the memory space I can search?
A: To the limit of huge pointer implementation and your hardware.
Q: When I use the BOYER.DLL, can the Find() functions be shared
by multiple applications?
A: Yes, as long as you bare the mind that BOYER.DLL functions
are not "handle" type function calls, i.e. BOYER.DLL do not
requires you to acquire a handle first and free a handle later,
which implies that a sequence of SetFindPattern(), Find() calls
can not interleave in multiple applications, which causes
the global data to corrupt.
Q: Does BOYER.DLL supports other data model other than LARGE?
A: All sources are provided as is and you can modify it for
your own model once you become a registered user. Send a
email to the author for other data models if you really want
it mad and do not want to modify the source code yourself.
Q: Does BOYER.DLL or BOYER.LIB supports multi-threaded search?
A: Yes, start from v0.4 BOYER allows the program to create a
HFIND handle for doing a search, where multiple HFIND handles
could be created at run-time in the same program or across
programs.
===============================================================================
3. References
===============================================================================
"Algorithms". Robert Sedgewick. Addison Wesley Publishing Company.
1988. 2nd addition. p286. QA76.6.S435 1983
"Faster String Searches". Doctor Dobb's Journal. Volume 14
Issue 7 July 1989. Costas Menico. p74.
-------------------------------------------------------------------------------